Skip to content

Add rev10: regression-safe max-azimuth refactor and validation harness#6

Open
nicklasorte wants to merge 1 commit intomainfrom
codex/create-optimized-version-of-subchunk_agg_check_maxazi
Open

Add rev10: regression-safe max-azimuth refactor and validation harness#6
nicklasorte wants to merge 1 commit intomainfrom
codex/create-optimized-version-of-subchunk_agg_check_maxazi

Conversation

@nicklasorte
Copy link
Copy Markdown
Owner

Motivation

  • Speed up the hot aggregation path while keeping strict per-iteration RNG identity unchanged so rev9 vs rev10 comparisons remain deterministic and auditable.
  • Remove inefficient azimuth chunking and repeated partial max scans while preserving function contract, units, and numerical behavior.
  • Provide a strong, reproducible validation harness to prove rev10 matches rev9 within a tight tolerance before any RNG/vectorization changes in a later revision.

Description

  • Added subchunk_agg_check_maxazi_rev10.m, a refactor of rev9 that preserves the MC seeding pattern (rng(rand_seed1+mc_iter), +1, +2) and keeps the MC loop to maintain RNG stream semantics while removing the azimuth chunk loop and repeated partial max scans; aggregation now computes dBm -> watts -> sum -> dBm across the full azimuth set and takes a single max per MC iteration.
  • Precomputes the off-axis gain matrix once (same nearest-neighbor behavior as rev9) and gates optional expensive NaN checks behind DEBUG_CHECKS=false to avoid hot-path cost.
  • Reduces temporary allocation overhead in the hot path by avoiding chunk slicing and redundant max operations while keeping numeric operations and shapes identical to rev9.
  • Added validate_subchunk_agg_check_maxazi_rev10.m, a deterministic regression harness that builds fixed synthetic cases (small, medium, large), runs both subchunk_agg_check_maxazi_rev9 and subchunk_agg_check_maxazi_rev10 on identical inputs, checks shape/NaN/Inf patterns, reproducibility, computes max_abs_diff and max_rel_diff, enforces a tight tolerance (tol = 1e-10), measures runtime (using timeit where available or tic/toc), and fails fast on any mismatch.

Testing

  • Created and inspected both new files (subchunk_agg_check_maxazi_rev10.m and validate_subchunk_agg_check_maxazi_rev10.m) using file-printing/smoke commands (nl -ba), which succeeded in this environment.
  • The validation harness is fully automated: it runs both rev9 and rev10 on three deterministic cases, checks size, column-shape [num_mc_idx x 1], NaN/Inf pattern equality, reproducibility, computes max_abs_diff and max_rel_diff (denominator max(abs(out_rev9),1e-12)), uses tol = 1e-10, measures runtime with timeit fallback to tic/toc, prints a structured summary, and raises an error on failure.
  • Note: the harness was not executed to completion inside this CI environment because no MATLAB/Octave interpreter was available here; the harness is ready to run in a MATLAB/Octave environment and will produce PASS/FAIL and runtime speedup numbers when executed there.

Codex Task

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: f9bee82b46

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment on lines +103 to +106
for loop_idx=1:1:num_mc_idx
base_mc=sort_monte_carlo_pr_dBm_all(:,loop_idx);
sort_temp_mc_dBm=base_mc+off_axis_gain_matrix;

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Reinstate bounded azimuth batching in MC aggregation

This revision computes base_mc + off_axis_gain_matrix for the full azimuth set on every Monte Carlo iteration, which raises peak temporary memory from O(num_bs*azi_chunk) in rev9 to O(num_bs*num_sim_azi). For larger production shapes (many BS rows and fine beamwidths), this can trigger out-of-memory failures on workers even though rev9’s chunked loop completed, so the refactor is not regression-safe under high-load inputs.

Useful? React with 👍 / 👎.

Comment on lines +40 to +43
abs_diff = abs(out_rev9_a - out_rev10_a);
rel_diff = abs_diff ./ denom;
max_abs_diff = max(abs_diff,[],'all');
max_rel_diff = max(rel_diff,[],'all');
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Mask non-finite entries before computing max diffs

The harness computes abs(out_rev9_a - out_rev10_a) and then takes max(...) over all elements; when both outputs contain matching Inf or NaN entries, operations like Inf-Inf produce NaN, causing max_abs_diff/max_rel_diff to become NaN and the final tolerance check to fail even though nan_pattern_match/inf_pattern_match are true. This yields false FAIL results for valid equivalence cases with non-finite outputs.

Useful? React with 👍 / 👎.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant